MixER: linear interpolation of latent space for entity resolution
Abstract
Entity resolution, the task of accurately identifying various representations of the same real-world entities, is a crucial part of data integration systems. While existing learning-based models can achieve good performance, they are extremely dependent on the quantity and quality of training data. In this paper, the MixER model is proposed to alleviate these problems. MixER utilizes our newly designed data augmentation method called EMix. EMix maps discrete entity records to continuous latent space variables (e.g., probability distributions) and then linearly interpolates in the latent space to generate many augmented samples. The matching model is further optimized based on the augmented samples to strengthen its generalization capability. MixER shows significant strengths in the sensitivity experiments when the amount of training data is below 50. In the robustness experiments, MixER presents an absolute performance advantage when the label noise exceeds 20%. In addition, the ablation experiments demonstrate that the developed EMix can effectively improve the generalization ability of the model. The overall experimental results show that MixER exhibits excellent performance over current state-of-the-art methods.
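The interpolation step described in the abstract can be illustrated with a minimal mixup-style sketch. This is not the authors' EMix implementation: the function name `interpolate_latent`, the Beta-distributed mixing coefficient, the `alpha` value, and the choice to interpolate the match labels along with the latent vectors are all assumptions made for illustration.

```python
import numpy as np

def interpolate_latent(z_a, z_b, y_a, y_b, alpha=0.4, rng=None):
    """Linearly interpolate two latent vectors and their labels.

    Hypothetical helper: z_a and z_b stand for latent representations of two
    entity-record pairs, y_a and y_b for their match labels (1.0 or 0.0).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: draw the mixing coefficient from Beta(alpha, alpha),
    # as is conventional in mixup-style augmentation.
    lam = rng.beta(alpha, alpha)
    z_mix = lam * z_a + (1.0 - lam) * z_b  # point on the segment between z_a and z_b
    y_mix = lam * y_a + (1.0 - lam) * y_b  # soft label mixed with the same coefficient
    return z_mix, y_mix, lam

# Usage: mix a matching pair (label 1.0) with a non-matching pair (label 0.0).
rng = np.random.default_rng(0)
z_match = np.array([1.0, 0.0])
z_nonmatch = np.array([0.0, 1.0])
z_new, y_new, lam = interpolate_latent(z_match, z_nonmatch, 1.0, 0.0, rng=rng)
```

Each call yields one augmented sample, so repeated calls over different pairs can produce many synthetic training examples from a small labeled set.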
Similar resources
A Latent Dirichlet Model for Unsupervised Entity Resolution
Entity resolution has received considerable attention in recent years. Given many references to underlying entities, the goal is to predict which references correspond to the same entity. We show how to extend the Latent Dirichlet Allocation model for this task and propose a probabilistic model for collective entity resolution for relational domains where references are connected to each other....
A Latent Dirichlet Allocation Model for Entity Resolution
In this paper, we address the problem of entity resolution, where given many references to underlying objects, the task is to predict which references correspond to the same object. We propose a probabilistic model for collective entity resolution. Our approach differs from other recently proposed entity resolution approaches in that it is a) unsupervised, b) generative and c) introduces a hidd...
The Effect of Transitive Closure on the Calibration of Logistic Regression for Entity Resolution
This paper describes a series of experiments in using logistic regression machine learning as a method for entity resolution. From these experiments the authors concluded that when a supervised ML algorithm is trained to classify a pair of entity references as linked or not linked pair, the evaluation of the model’s performance should take into account the transitive closure of its pairwise lin...
Information Space Models for Data Integration, and Entity Resolution
Geospatial information systems provide a unique frame of reference to bring together a large and diverse set of data from a variety of sources. However, automating this process remains a challenge since: 1) data (particularly from sensors) is error prone and ambiguous, 2) analysis and visualization tools typically expect clean (or exact) data, and 3) it is difficult to describe how different da...
Warped distance for space-variant linear image interpolation
The problem of image interpolation using linear techniques is dealt with in this paper. Conventional space-invariant methods are revisited and changed into space-variant ones, by introducing the concept of the warped distance among the pixels of an image. A better perceptual rendition of the image details is obtained in this way; this effect is proved both via the evaluation of the response to ...
Journal
Journal title: Complex & Intelligent Systems
Year: 2023
ISSN: 2198-6053, 2199-4536
DOI: https://doi.org/10.1007/s40747-023-01018-2